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Abstract 

The Cauchy-Schwartz majorization of the distance function in SMACOF is replaced 
by a majorization of the squared distance function. This leads to an interesting SMACOF 
alternative, which we call SMOCAF. 
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Note: This is a working paper which will be expanded/updated frequently. All suggestions 
for improvement are welcome. The directory gih.stat.ucla.edu/cmds has a pdf version, the 
bib hie, the complete Rmd hie with the code chunks, and the R source code. 


1 Introduction 

The (Euclidean, metric, least squares) multidimensional scaling or MDS problem is a special 
case of a nonconvex optimization problem in which we minimize a function of the form 

i n _ 

f(x) = l + -x'x - J2 y x ' A i x , 

“ i =1 

where the A* are given positive senri-dehnite matrices. Because the third term on the right, 
the one with the square root, is convex, the problem is a DC-problem (Le Thi and Tao (2018)), 
i.e. the minimization of the difference of two convex functions, one of which is quadratic 
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and one of which is not differentiable at some points. We do not really have to worry about 
non-differentiability, because of the result in De Leeuw (1984). 

2 SMACOF 

The SMACOF algorithm (De Leeuw (1977), De Leeuw and Heiser (1977), De Leeuw and 
Mail' (2009)) to minimize the MDS objective function / is a majorization algorithm (De 
Leeuw (1994)), currently more commonly known as an MM algorithm (Lange (2016)). It is 
based on the Cauchy-Schwarz inequality 


y/x'AiX y'Aiy > x'Aiy. 

Define 

A (( x'AiX ) _ 5 if x'AiX ^ 0, 
e i\ x ) ~ 1 

I 0 otherwise . 

and 

A 1 n 

g(x , y) = 1 + -x'x - e i(y)x'Aiy, 

1 i=\ 

then we have the basic majorization relations fix) < g(x,y) and f(x) = g(x,x). Thus step k 
in the iterative algorithm minimizes g(x,x^) over x, which leads to the convergent algorithm 

n 

x (k+i) = ^ e .( s W) i 4./) i 
i =1 


3 SMOCAF 

For our alternative majorization we use the inequality 

x'A^ > 2 x'A^ — y'Aiy. 

Define the polyhedral convex set C(y) of all x such that 2 x'Aiy — y'A(y > 0 for all i. Clearly 
y G C(y), and thus C(y) is non-empty. Also define 

h(x, y) = 1 + -x'x - Y, ~ y'A^, 

1 i=i 

for all x G C(y). Because the square root of a non-negative linear form is concave, minimizing 
h(x, x^) over x G C(x is a convex problem with linear constraints and leads to a convergent 
MM algorithm, which we call SMOCAF. 

Of course a SMOCAF step is inherently more complicated than a SMACOF step. The idea 
is interesting, at least to me, because the SMOCAF algorithm solves a sequence of convex 
programming problems in which the constraint set changes over iterations. 
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4 Example 

We try out both SMACOF and SMOCAF on a simple example, where there are 10 matrices 
Ai, all covariance matrices of a 20 by 5 matrix filled with random standard normals. The 
SMOCAF iterations are implemented using constrOptimO from base R, which uses a 
logarithmic barrier method for the convex programming step. Since we have very good initial 
estimates, and we do not really have to iterate until convergence within each step, I assume 
the SMOCAF implementation can be made much more efficient. In addition, acceleration 
techniques specific to DC programming, such as the one discussed in Artacho, Fleming, and 
Vuong (2017), conld be applied. Or general acceleration techniques, discussed recently in 
great detail in Sidi (2017). But optimal speed is not what interests us here. 

system.time(print(smacof (a, rep(l,5), verbose = FALSE))) 

## $itel 

## [1] 116 

## 

## $stress 

## [1] -61.77740087 

## 

## $coef 

## [1] 0.7270085698 6.3676564611 8.5480388387 -2.1757084262 2.1625300759 

## 

## $grad 

## [1] -7.316141845e-07 -4.745316848e-06 4.648248850e-06 5.576051584e-06 

## [5] 1.455168160e-06 

## user system elapsed 
## 0.043 0.005 0.051 

system.time(print(smocaf (a, rep(l,5), verbose = FALSE))) 

## $itel 

## [ 1 ] 120 

## 

## $stress 

## [1] -61.77740087 

## 

## $coef 

## [1] 0.727008322 6.367654855 8.548040412 -2.175706538 2.162530569 

## 

## $grad 

## [1] -7.554717054e-07 -4.899925868e-06 4.799695000e-06 5.757723983e-06 

## [5] 1.502571097e-06 

## user system elapsed 
## 0.210 0.013 0.224 
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For both majorizations convergence is nice and smooth, and both converge to the same 
solution. The number of iterations is about the same, but since SMOCAF iterations are 
considerably more time-consuming the overall SMOCAF time is about 10 times the SMACOF 
time. 


5 Appendix: Code 

5.1 smacof.R 

smacof <- 
function (a, 

xold, 

itmax = 1000, 
eps = le-10, 
verbose = TRUE) { 
n <- dim(a) [3] 
m <- dim(a) [1] 
dold <- rep(0, n) 
itel <- 1 
for (i in l:n) 

dold[i] <- sqrt (sum (a[, , i] * outer (xold, xold))) 
sold <- sum (xold 2) / 2 - sum(dold) 
repeat { 

xnew <- rep (0, m) 
dnew <- rep (0, n) 
for (i in l:n) 

xnew <- xnew + drop(a[, , i] °/ 0 *°/ 0 xold) / dold [i] 
for (i in l:n) 

dnew[i] <- sqrt (sum (a[, , i] * outer (xnew, xnew))) 
snew <- sum (xnew 2) / 2 - sum (dnew) 
if (verbose) 
cat ( 

"Iteration: ", 

formate (itel, width = 3, format = "d"), 

"sold: ", 
formate ( 
sold, 

digits = 8, 
width = 12, 
format = "f" 

), 

"snew: ", 
formate ( 
snew, 
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digits = 8, 
width = 12, 
format = "f" 

), 

"\n" 

) 

if ((sold - snew) < eps I I (itel == itmax)) 
break 

sold <- snew 
xold <- xnew 
dold <- dnew 
itel <- itel + 1 

> 

g <- xnew 

for (i in l:n) { 

g <- g - drop(a[, , i] 0 / 0 *°/ 0 xnew) / dnew[i] 

> 

return(list ( 
itel = itel, 
stress = snew, 
coef = xnew, 
grad = g 

)) 


5.2 smocaf.R 

obj <- function (x, y, a) { 
f <- sum (x ~ 2) / 2 
for (i in 1:10) { 

f <- f - sqrt (abs (sum (a[, , i] * outer (2 * x - y, y)))) 

} 

return (f) 


grad <- function (x, y, a) { 
g <- x 

for (i in 1:10) { 
g <" 

g - (1 / sqrt (abs (sum ( 

a[, , i] * outer (2 * x - y, y) 
)))) * a[, , i] %*% y 

} 

return (g) 
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smocaf <- 
function (a, 

xold, 

itmax = 1000, 
eps = le-10, 
verbose = TRUE) { 
n <- dim (a) [3] 
m <- dim(a) [1] 
dold <- rep(0, n) 
itel <- 1 
for (i in l:n) 

dold[i] <- sqrt (sum (a[, , i] * outer (xold, xold))) 
sold <- sum (xold 2) / 2 - sum(dold) 
repeat { 

uold <- matrix (0, n, m) 
cold <- rep (0, n) 
dnew <- rep (0, n) 
for (i in l:n) { 

uold[i, ] <- 2 * a[, , i] °/ 0 *7 0 xold 

cold[i] <- sum (a[, , i] * outer (xold, xold)) 

} 

h <- 

constrOptim (xold, obj, grad, uold, cold, y = xold, a = a) 
xnew <- h$par 
for (i in l:n) 

dnew[i] <- sqrt (sum (a[, , i] * outer (xnew, xnew))) 
snew <- sum (xnew 2) / 2 - sum (dnew) 
if (verbose) 
cat ( 

"Iteration: ", 

formate (itel, width = 3, format = "d"), 

"sold: ", 
formate ( 
sold, 

digits = 8, 
width = 12, 
format = "f" 

), 

"snew: ", 
formate ( 
snew, 

digits = 8, 
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width = 12, 
format = "f" 

), 

"\n" 

) 

if ((sold - snew) < eps I I (itel == itmax)) 
break 

sold <- snew 
xold <- xnew 
dold <- dnew 
itel <- itel + 1 

> 

g <- xnew 

for (i in l:n) { 

g <- g - drop(a[, , i] xnew) / dnew[i] 

> 

return(list ( 
itel = itel, 
stress = snew, 
coef = xnew, 
grad = g 

)) 

} 
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